[70B-Part2] Improved save model (that can work with FSDP) #107

Merged: farzadab merged 4 commits into main from farzad-fsdp-p2 on Sep 16, 2024

Conversation

farzadab
Contributor

@farzadab farzadab commented Sep 10, 2024

The fixes in this PR might not make much sense on their own, but here's what's changing:

  1. no_split_modules is now set dynamically. The previous approach (listing every possible class) apparently leads to an error, since every listed class is expected to be present.

  2. The way the state_dict and load_state_dict logic is applied is slightly modified:
    a. using the existing register_hook method instead
    b. changing save_pretrained instead of state_dict. TODO: this might also end up fixing some of the warnings we were seeing and suppressing (not tested yet).

  3. train.py reverts to using trainer.save_model instead of saving through the pipeline (in order to work with FSDP), but we will still save the pipeline code and configs.
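
For readers unfamiliar with item 1, here is a minimal sketch of what setting no_split_modules dynamically could look like. It is not the code from this PR: `FakeSubmodel` and `collect_no_split_modules` are hypothetical names, the class-name strings are illustrative, and the sketch assumes each wrapped sub-model exposes a `_no_split_modules` list (or `None`), as Hugging Face models do. The idea is to build the list only from the sub-models that are actually loaded, rather than hard-coding every class that might appear.

```python
from typing import Iterable, List, Optional


class FakeSubmodel:
    """Hypothetical stand-in for a wrapped sub-model (language model, audio encoder, ...)."""

    def __init__(self, no_split_modules: Optional[List[str]]):
        # Mirrors the `_no_split_modules` attribute; it may be None when a
        # model declares no such classes.
        self._no_split_modules = no_split_modules


def collect_no_split_modules(submodels: Iterable[object]) -> List[str]:
    """Return the union of no-split class names declared by the given sub-models."""
    collected: List[str] = []
    for sub in submodels:
        # Treat a missing attribute or None as "nothing declared".
        for name in getattr(sub, "_no_split_modules", None) or []:
            if name not in collected:  # keep first-seen order, drop duplicates
                collected.append(name)
    return collected


# Illustrative class names only; a real model would report its own.
language_model = FakeSubmodel(["LlamaDecoderLayer"])
audio_tower = FakeSubmodel(["WhisperEncoderLayer"])

no_split = collect_no_split_modules([language_model, audio_tower])
print(no_split)
```

With this approach, swapping in a different language model or audio encoder changes the resulting list automatically, so no class name is listed unless a loaded sub-model declares it.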

@farzadab farzadab changed the base branch from farzad-fsdp-p1 to main September 10, 2024 20:22
ultravox/training/train.py (outdated, resolved)
ultravox/model/ultravox_model.py (resolved)
ultravox/training/train.py (outdated, resolved)
@farzadab farzadab enabled auto-merge (squash) September 13, 2024 22:46
@farzadab farzadab disabled auto-merge September 13, 2024 22:51
@farzadab
Contributor Author

@juberti @zqhuang211 are there any more comments?
If not, please approve so we can merge this.

ultravox/training/train.py (outdated, resolved)
@farzadab farzadab enabled auto-merge (squash) September 16, 2024 20:12
@farzadab farzadab merged commit b12be46 into main Sep 16, 2024
1 check passed
@farzadab farzadab deleted the farzad-fsdp-p2 branch September 16, 2024 23:51
2 participants